Overview

Dataset statistics

Number of variables: 6
Number of observations: 14230077
Missing cells: 9
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 651.4 MiB
Average record size in memory: 48.0 B
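
The two memory figures above can be cross-checked with a little arithmetic. The sketch below assumes 1 MiB = 2**20 bytes and copies the counts from this block. Reading 48 B per record as six 8-byte string references (a shallow measurement that excludes the string contents) is an inference, not something the report states.

```python
# Figures copied from the "Dataset statistics" block above.
n_rows = 14_230_077
n_cols = 6
total_mib = 651.4

MIB = 2 ** 20  # bytes per mebibyte

avg_record_bytes = total_mib * MIB / n_rows
bytes_per_cell = avg_record_bytes / n_cols

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"average per cell:    {bytes_per_cell:.1f} B")

# Assumption: each cell is stored as one 8-byte reference to a string object.
per_column_mib = n_rows * 8 / MIB
print(f"one column of 8-byte references: {per_column_mib:.1f} MiB")
```

Under that assumption, the per-column value agrees with the 108.6 MiB memory size reported for each variable.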

Variable types

Text: 6

Alerts

nconst has unique values (Unique)

Reproduction

Analysis started: 2025-03-06 16:20:33.014492
Analysis finished: 2025-03-06 16:39:00.473486
Duration: 18 minutes and 27.46 seconds
Software version: ydata-profiling v4.13.0
Download configuration: config.json
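
The reported duration follows directly from the two timestamps. A minimal check using only the standard library, with the timestamp format string being the only assumption:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps above
started = datetime.strptime("2025-03-06 16:20:33.014492", FMT)
finished = datetime.strptime("2025-03-06 16:39:00.473486", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```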

Variables

nconst
Text

Unique 

Distinct: 14230077
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 108.6 MiB

Length

Max length: 10
Median length: 9
Mean length: 9.4086813
Min length: 9
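
Because the minimum length is 9 and the maximum is 10, every nconst value has exactly 9 or 10 characters. The number of each can be derived from the total character count that the report gives for this variable under "Characters and Unicode". A small sketch of that arithmetic (the variable names are illustrative):

```python
n_rows = 14_230_077          # number of observations
total_chars = 133_886_260    # "Total characters" reported for nconst
min_len, max_len = 9, 10     # "Min length" and "Max length"

mean_len = total_chars / n_rows

# Each 10-character value contributes exactly one character beyond
# the 9-character baseline, so the surplus equals the count of long values.
n_long = total_chars - min_len * n_rows
n_short = n_rows - n_long

print(f"mean length: {mean_len:.7f}")
print(f"{n_short} values of length {min_len}, {n_long} of length {max_len}")
```

More than half of the values come out at 9 characters, which is consistent with the reported median length of 9.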

Characters and Unicode

Total characters: 133886260
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
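
As an illustration of such character properties, the sketch below uses Python's standard `unicodedata` module to count Unicode general categories over a few sample values of this variable. The report itself lists only a single "(unknown)" category, script and block, so this breakdown is not taken from it. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

sample = ["nm0000001", "nm0000002", "nm0000003"]
counts = category_counts(sample)

# 'Ll' is "Letter, lowercase"; 'Nd' is "Number, decimal digit".
for category, n in counts.items():
    print(category, n)

print(unicodedata.name("n"))
```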

Unique

Unique: 14230077
Unique (%): 100.0%

Sample

1st row: nm0000001
2nd row: nm0000002
3rd row: nm0000003
4th row: nm0000004
5th row: nm0000005

Value                    Count     Frequency (%)
nm0000014                1         < 0.1%
nm9993719                1         < 0.1%
nm0000001                1         < 0.1%
nm0000002                1         < 0.1%
nm0000003                1         < 0.1%
nm0000004                1         < 0.1%
nm0000005                1         < 0.1%
nm0000006                1         < 0.1%
nm0000007                1         < 0.1%
nm0000008                1         < 0.1%
Other values (14230067)  14230067  > 99.9%

Most occurring characters

Value             Count     Frequency (%)
1                 16114734  12.0%
n                 14230077  10.6%
m                 14230077  10.6%
0                 10350550  7.7%
3                 10266384  7.7%
2                 10261309  7.7%
4                 10188605  7.6%
5                 10156454  7.6%
6                 10088824  7.5%
7                 9368688   7.0%
Other values (2)  18630558  13.9%

Most occurring categories, scripts and blocks

All 133886260 characters (100.0%) are assigned to a single (unknown) category, a single (unknown) script and a single (unknown) block. The per-category, per-script and per-block character tables are therefore identical to the "Most occurring characters" table above.

primaryName
Text

Distinct: 10909569
Distinct (%): 76.7%
Missing: 9
Missing (%): < 0.1%
Memory size: 108.6 MiB

Length

Max length: 105
Median length: 78
Mean length: 13.510703
Min length: 1

Characters and Unicode

Total characters: 192258216
Distinct characters: 208
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 9800087
Unique (%): 68.9%

Sample

1st row: Fred Astaire
2nd row: Lauren Bacall
3rd row: Brigitte Bardot
4th row: John Belushi
5th row: Ingmar Bergman

Value                   Count     Frequency (%)
david                   134343    0.5%
john                    126310    0.4%
michael                 125450    0.4%
james                   87766     0.3%
de                      81566     0.3%
paul                    70447     0.2%
robert                  69169     0.2%
daniel                  68968     0.2%
chris                   68383     0.2%
thomas                  62618     0.2%
Other values (2255596)  28730123  97.0%

Most occurring characters

Value               Count     Frequency (%)
a                   19998123  10.4%
e                   16221563  8.4%
(space)             15395075  8.0%
n                   13143503  6.8%
i                   13048092  6.8%
r                   12025350  6.3%
o                   10493790  5.5%
l                   8836076   4.6%
s                   6983723   3.6%
t                   6212808   3.2%
Other values (198)  69900113  36.4%

Most occurring categories, scripts and blocks

All 192258216 characters (100.0%) are assigned to a single (unknown) category, a single (unknown) script and a single (unknown) block. The per-category, per-script and per-block character tables are therefore identical to the "Most occurring characters" table above.

birthYear
Text

Distinct: 559
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 108.6 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.0899909
Min length: 1

Characters and Unicode

Total characters: 29740731
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 172
Unique (%): < 0.1%

Sample

1st row: 1899
2nd row: 1924
3rd row: 1934
4th row: 1949
5th row: 1918

Value               Count     Frequency (%)
n                   13589764  95.5%
1980                10261     0.1%
1981                9966      0.1%
1979                9877      0.1%
1982                9841      0.1%
1978                9740      0.1%
1983                9473      0.1%
1984                9450      0.1%
1977                9169      0.1%
1985                9111      0.1%
Other values (549)  553425    3.9%
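
The top entry "n" most likely stands for the placeholder \N that appears in the sample rows of this dataset: its count, 13589764, equals the counts of the characters "\" and "N" reported for this variable. One plausible explanation, assuming the word table lowercases each token and strips surrounding punctuation (an assumption about the profiler's tokenisation, not something the report states), is sketched below. The helper `normalise_token` is hypothetical.

```python
import string

def normalise_token(token: str) -> str:
    """Hypothetical normalisation: lowercase, then strip surrounding punctuation."""
    return token.lower().strip(string.punctuation)

# The backslash is part of string.punctuation, so only the letter survives.
print(repr(normalise_token(r"\N")))

# A plain year contains no punctuation and passes through unchanged.
print(repr(normalise_token("1980")))
```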

Most occurring characters

Value             Count     Frequency (%)
\                 13589764  45.7%
N                 13589764  45.7%
1                 719344    2.4%
9                 718216    2.4%
8                 209055    0.7%
7                 159791    0.5%
6                 137948    0.5%
2                 128932    0.4%
4                 125784    0.4%
5                 124575    0.4%
Other values (2)  237558    0.8%

Most occurring categories, scripts and blocks

All 29740731 characters (100.0%) are assigned to a single (unknown) category, a single (unknown) script and a single (unknown) block. The per-category, per-script and per-block character tables are therefore identical to the "Most occurring characters" table above.

deathYear
Text

Distinct: 502
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 108.6 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.0338623
Min length: 2

Characters and Unicode

Total characters: 28942017
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 175
Unique (%): < 0.1%

Sample

1st row: 1987
2nd row: 2014
3rd row: \N
4th row: 1982
5th row: 2007

Value               Count     Frequency (%)
n                   13989124  98.3%
2021                7614      0.1%
2022                7248      0.1%
2020                7223      0.1%
2023                7006      < 0.1%
2024                6291      < 0.1%
2019                6100      < 0.1%
2018                5871      < 0.1%
2016                5762      < 0.1%
2017                5742      < 0.1%
Other values (492)  182096    1.3%

Most occurring characters

Value             Count     Frequency (%)
\                 13989124  48.3%
N                 13989124  48.3%
2                 195697    0.7%
0                 195441    0.7%
1                 192286    0.7%
9                 161643    0.6%
8                 46973     0.2%
7                 40025     0.1%
6                 35297     0.1%
4                 33930     0.1%
Other values (2)  62477     0.2%

Most occurring categories, scripts and blocks

All 28942017 characters (100.0%) are assigned to a single (unknown) category, a single (unknown) script and a single (unknown) block. The per-category, per-script and per-block character tables are therefore identical to the "Most occurring characters" table above.

primaryProfession
Text

Distinct: 23207
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 108.6 MiB

Length

Max length: 67
Median length: 64
Mean length: 12.197102
Min length: 2

Characters and Unicode

Total characters: 173565705
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 5598
Unique (%): < 0.1%

Sample

1st row: actor,miscellaneous,producer
2nd row: actress,soundtrack,archive_footage
3rd row: actress,music_department,producer
4th row: actor,writer,music_department
5th row: writer,director,actor

Value                 Count    Frequency (%)
n                     2784781  19.6%
actor                 2517288  17.7%
actress               1615502  11.4%
miscellaneous         822202   5.8%
producer              487929   3.4%
camera_department     439709   3.1%
art_department        265522   1.9%
writer                230860   1.6%
sound_department      222394   1.6%
composer              174736   1.2%
Other values (23197)  4669154  32.8%
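
The table above lists 10 entries plus 23197 others, which matches the Distinct count of 23207. This suggests that it counts whole field values (for example, rows whose only profession is "actor") rather than individual professions. To count each profession separately, the comma-separated values would need to be split first. A minimal sketch on the five sample rows, assuming \N marks a missing value and using the hypothetical helper `profession_counts`:

```python
from collections import Counter

# The five sample rows shown for this variable.
rows = [
    "actor,miscellaneous,producer",
    "actress,soundtrack,archive_footage",
    "actress,music_department,producer",
    "actor,writer,music_department",
    "writer,director,actor",
]

MISSING = r"\N"  # assumed placeholder for "no value"

def profession_counts(values):
    """Split each comma-separated value and count individual professions."""
    counts = Counter()
    for value in values:
        if value == MISSING:
            continue  # skip placeholder rows instead of counting them
        counts.update(value.split(","))
    return counts

counts = profession_counts(rows)
for profession, n in counts.most_common():
    print(profession, n)
```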

Most occurring characters

Value              Count     Frequency (%)
r                  19910679  11.5%
e                  19888051  11.5%
t                  18427783  10.6%
a                  16522430  9.5%
c                  12802345  7.4%
o                  11510696  6.6%
s                  10612078  6.1%
n                  7764586   4.5%
m                  7672207   4.4%
i                  7292577   4.2%
Other values (16)  41162273  23.7%

Most occurring categories, scripts and blocks

All 173565705 characters (100.0%) are assigned to a single (unknown) category, a single (unknown) script and a single (unknown) block. The per-category, per-script and per-block character tables are therefore identical to the "Most occurring characters" table above.

knownForTitles
Text

Distinct: 5915735
Distinct (%): 41.6%
Missing: 0
Missing (%): 0.0%
Memory size: 108.6 MiB

Length

Max length: 43
Median length: 42
Mean length: 16.169765
Min length: 2

Characters and Unicode

Total characters: 230096998
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 4889807
Unique (%): 34.4%

Sample

1st row: tt0072308,tt0050419,tt0027125,tt0031983
2nd row: tt0037382,tt0075213,tt0117057,tt0038355
3rd row: tt0057345,tt0049189,tt0056404,tt0054452
4th row: tt0072562,tt0077975,tt0080455,tt0078723
5th row: tt0050986,tt0069467,tt0050976,tt0083922

Value                   Count     Frequency (%)
n                       1621517   11.4%
tt0123338               8258      0.1%
tt22014400              7508      0.1%
tt6168110               6382      < 0.1%
tt0441074               4879      < 0.1%
tt0072584               4305      < 0.1%
tt0159881               4067      < 0.1%
tt11874658              3926      < 0.1%
tt0479832               3898      < 0.1%
tt4202558               3625      < 0.1%
Other values (5915725)  12561712  88.3%

Most occurring characters

Value             Count     Frequency (%)
t                 46545878  20.2%
0                 22941588  10.0%
1                 21138130  9.2%
2                 19765024  8.6%
4                 17139758  7.4%
3                 16574438  7.2%
8                 15936007  6.9%
6                 15676043  6.8%
5                 13796997  6.0%
7                 13496133  5.9%
Other values (4)  27087002  11.8%

Most occurring categories, scripts and blocks

All 230096998 characters (100.0%) are assigned to a single (unknown) category, a single (unknown) script and a single (unknown) block. The per-category, per-script and per-block character tables are therefore identical to the "Most occurring characters" table above.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
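
The report counts only 9 missing cells, yet the placeholder \N appears in many sample rows, so these placeholders appear to have been read as ordinary text. A per-column nullity count that treats \N as missing could be sketched as follows. It assumes a tab-separated layout and uses three rows copied from the Sample section; the helper `null_counts` is hypothetical.

```python
import csv
import io

# Three rows from the Sample section, written as tab-separated text (assumed layout).
TSV = (
    "nconst\tprimaryName\tbirthYear\tdeathYear\tprimaryProfession\tknownForTitles\n"
    "nm0000001\tFred Astaire\t1899\t1987\tactor,miscellaneous,producer\t"
    "tt0072308,tt0050419,tt0027125,tt0031983\n"
    "nm0000003\tBrigitte Bardot\t1934\t\\N\tactress,music_department,producer\t"
    "tt0057345,tt0049189,tt0056404,tt0054452\n"
    "nm9993710\tNestor Rudnytskyy\t\\N\t\\N\t\\N\t\\N\n"
)

def null_counts(text, marker=r"\N"):
    """Count, per column, how many cells equal the missing-value marker."""
    # QUOTE_NONE keeps quote characters inside names from being interpreted.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value == marker:
                counts[name] += 1
    return counts

counts = null_counts(TSV)
for column, n in counts.items():
    print(f"{column}: {n}")
```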

Sample

   nconst     primaryName      birthYear  deathYear  primaryProfession                   knownForTitles
0  nm0000001  Fred Astaire     1899       1987       actor,miscellaneous,producer        tt0072308,tt0050419,tt0027125,tt0031983
1  nm0000002  Lauren Bacall    1924       2014       actress,soundtrack,archive_footage  tt0037382,tt0075213,tt0117057,tt0038355
2  nm0000003  Brigitte Bardot  1934       \N         actress,music_department,producer   tt0057345,tt0049189,tt0056404,tt0054452
3  nm0000004  John Belushi     1949       1982       actor,writer,music_department       tt0072562,tt0077975,tt0080455,tt0078723
4  nm0000005  Ingmar Bergman   1918       2007       writer,director,actor               tt0050986,tt0069467,tt0050976,tt0083922
5  nm0000006  Ingrid Bergman   1915       1982       actress,producer,soundtrack         tt0034583,tt0038109,tt0036855,tt0038787
6  nm0000007  Humphrey Bogart  1899       1957       actor,producer,miscellaneous        tt0034583,tt0043265,tt0033870,tt0037382
7  nm0000008  Marlon Brando    1924       2004       actor,director,writer               tt0078788,tt0068646,tt0047296,tt0070849
8  nm0000009  Richard Burton   1925       1984       actor,producer,director             tt0061184,tt0087803,tt0059749,tt0057877
9  nm0000010  James Cagney     1899       1986       actor,director,producer             tt0029870,tt0031867,tt0042041,tt0034236

          nconst     primaryName         birthYear  deathYear  primaryProfession                    knownForTitles
14230067  nm9993709  Lu Bevins           \N         \N         producer,director,writer             tt17717854,tt11772904,tt11772812,tt11697102
14230068  nm9993710  Nestor Rudnytskyy   \N         \N         \N                                   \N
14230069  nm9993711  David Gluzman       \N         \N         \N                                   \N
14230070  nm9993712  Corny O'Connell     \N         \N         \N                                   \N
14230071  nm9993713  Sambit Mishra       \N         \N         writer,producer                      tt20319332,tt27191658,tt10709066,tt15134202
14230072  nm9993714  Romeo del Rosario   \N         \N         animation_department,art_department  tt11657662,tt14069590,tt2455546
14230073  nm9993716  Essias Loberg       \N         \N         \N                                   \N
14230074  nm9993717  Harikrishnan Rajan  \N         \N         cinematographer                      tt8736744
14230075  nm9993718  Aayush Nair         \N         \N         cinematographer                      tt8736744
14230076  nm9993719  Andre Hill          \N         \N         \N                                   \N